Abstract — We study the design of amplitude phase-shift keying 
(APSK) constellations for a coherent fiber-optical communication 
system where nonlinear phase noise (NLPN) is the main system 
impairment. APSK constellations can be regarded as a union 
of phase-shift keying (PSK) signal sets with different amplitude 
levels. A practical two-stage (TS) detection scheme is analyzed, 
which performs close to optimal detection for high enough input 
power. We optimize APSK constellations with 4, 8, and 16 points 
in terms of symbol error probability (SEP) under TS detection for 
several combinations of input power and fiber length. Our results 
show that APSK is a promising modulation format for coping
with NLPN. As an example, for 16 points, performance
gains of 3.2 dB can be achieved at a SEP of $10^{-2}$ compared to
16-QAM by choosing an optimized APSK constellation. We also
demonstrate that in the presence of severe nonlinear distortions, 
it may become beneficial to sacrifice a constellation point or an 
entire constellation ring to reduce the average SEP. Finally, we 
discuss the problem of selecting a good binary labeling for the 
found constellations. For the class of rectangular APSK a labeling 
design method is proposed, resulting in near-optimal bit error 
probability. 

Index Terms — APSK constellation, binary labeling, nonlinear 
phase noise, optical Kerr-effect, self-phase modulation. 



I. Introduction 

Fiber nonlinearities are considered to be one of the limiting
factors for achieving high data rates in coherent long-haul
optical transmission systems [1]-[3]. Therefore, a good
understanding of the influence of nonlinearities on the system 
behavior is necessary in order to increase data rates of future 
optical transmission systems. 

The optimal design of a signal constellation, i.e., placing 
M points in the complex plane such that the symbol error 
probability (SEP) is minimized under an average or peak 
power constraint, can be considered a classical problem in 
communication theory [4, Ch. 1]. However, so far, little
is known about the influence of fiber nonlinearities on the
optimal choice of the signal set. In this paper, we consider
signal constellation design for a nonlinear fiber-optical channel
model. We focus on a specific class of constellations called
amplitude phase-shift keying (APSK), which can be defined
as the union of phase-shift keying (PSK) signal sets with
different amplitude levels. This choice is motivated by the
fact that these constellations are employed in recent satellite
communication standards due to their flexibility and robustness
against nonlinear amplifier distortions, see, for example, [5], [6],
[7, pp. 27-28] and references therein.

The input-output relationship of the fiber-optical baseband
channel is given implicitly in the form of the stochastic
nonlinear Schrödinger equation [8, Ch. 2]. It is well recognized
that this type of channel model does not lend itself to an easy
solution for various communication theoretic problems [9],
[10]. We pursue a "bottom-up" approach for the signal
constellation design by considering a simplified, dispersion-less
channel model. This model captures the interaction of
Kerr nonlinearities with the signal itself and the inline amplified
spontaneous emission (ASE) noise, giving rise to nonlinear
phase noise (NLPN) [11]. The discrete channel model is
obtained from the waveform channel on a per-sample basis
[12, Sec. III] and the probability density function (PDF) was
derived using several different methods in [3], [12]-[14].

Signal constellation design and detection in the presence of
NLPN have been studied previously in [15] and [16]. In [15],
the authors applied several predistortion and postcompensation
techniques in combination with minimum-distance detection
for quadrature amplitude modulation (QAM) to mitigate the
effect of NLPN. They also proposed a two-stage (TS) detector
consisting of a radius detector, an amplitude-dependent phase
rotation, and a phase detector. Moreover, parameter optimization
was performed with respect to four custom 4-point
constellations under maximum likelihood (ML) detection. In
[16], the TS detector was used to optimize the radii of four
16-point constellations for two power regimes. It was shown in
[16] that the optimal radii highly depend on the transmit power.
In [10], [17], a capacity analysis is presented for a fiber-optical
channel model. The authors optimize the number of rings, the
radii, and the occupation probabilities of continuous-input ring
constellations, where the objective function is given by the
mutual information.

In this paper, we first analyze in detail the (suboptimal) TS
detector developed in [15]. We regard radius detection and
phase rotation as a separate processing block that yields a
postcompensated observation and we explain how to derive the
corresponding PDF for constellations with multiple amplitude
levels. To the best of our knowledge, this PDF has not been
previously derived, possibly due to the fact that the SEP under
TS detection can be calculated with a simplified PDF [18].
The new PDF is used to gain insights into the performance
behavior of the TS detector compared to optimal detection. We
also show that this PDF is necessary to accurately calculate
the average bit error probability (BEP) of the constellation.

We then find optimal APSK constellations in terms of
SEP under TS detection for a given input power and fiber
length. In contrast to [16], we optimize the number of rings,
the number of points per ring, as well as the radii. For the
case $M = 4$, we choose identical system parameters as in
[15] and a comparison reveals that our approach results in
similar, sometimes better, constellations, with the advantage
of much lower computational design complexity. This allows
us to extend the optimization to $M = 8$ and $M = 16$. For
the latter case, our results show that the widely employed
16-QAM constellation has poor performance compared to the best
found constellations over a wide range of input powers for
this channel model and detector. We also provide numerical
support for the phenomenon of sacrificial points or satellite
constellations, which arise in the context of constellation
optimization in the presence of very strong nonlinearities [19],
[20]. Our findings show, somewhat counterintuitively, that it
is sometimes optimal to place a constellation point (or even
an entire constellation ring) far away from all other points in
order to improve the average performance of the constellation.

Due to the separation of a hard-decision symbol detector
and subsequent error correction in state-of-the-art fiber-optical
communication systems, the uncoded BEP is an important
figure of merit. Therefore, we also address the problem of
choosing a good binary labeling for APSK constellations in
the presence of NLPN. We pay special attention to a class of
APSK constellations that allows the use of a Gray labeling,
which we call rectangular APSK. For this class, we propose a
method to choose the phase offsets of the constellation rings
resulting in near-optimal performance. Finally, we discuss the
problem of finding labelings for constellations that do not fall
into the above class.

The remainder of the paper is organized as follows. In
Sec. II, we present the channel model and define the generic
APSK signal set. In Sec. III, we briefly review ML detection
and then describe and analyze the suboptimal TS detector
together with the corresponding PDF. The results of the
constellation optimization with respect to SEP are presented
and discussed in Sec. IV. Binary labelings are discussed in
Sec. V. Concluding remarks can be found in Sec. VI.

TABLE I
Constants and Parameters Taken from [15]



II. System Model 

A. Channel 

We consider the discrete memoryless channel [3, Ch. 5]

$$Y = (X + Z) e^{-\jmath \Phi_{\mathrm{NL}}}, \tag{1}$$

where $X \in \mathcal{X}$ denotes the complex channel input, $\mathcal{X}$ the
signal constellation, $Z$ the total additive noise, $Y$ the channel
observation, and $\Phi_{\mathrm{NL}}$ the NLPN, which is given by [3, Ch. 5]

$$\Phi_{\mathrm{NL}} = \frac{\gamma L}{K} \sum_{i=1}^{K} |X + Z_i|^2. \tag{2}$$

Fig. 1. Scatter plots of $Y$ assuming $X \in \mathcal{X}_{16\text{-QAM}}$ for several combinations
of transmit power $P$ and fiber length $L$.



In (2), $\gamma$ is the nonlinear Kerr-parameter, $L$ is the total length
of the fiber, $K$ denotes the number of fiber segments, and $Z_i$
is the noise contribution of all fiber segments up to segment
$i$. More precisely, $Z_i = N_1 + \ldots + N_i$ is defined as the
sum of $i$ independent and identically distributed complex
Gaussian random variables with zero mean and variance $\sigma_0^2$
per dimension (real and imaginary parts). The total additive
noise of all $K$ fiber segments is denoted by $Z = Z_K$, which
has variance $\sigma^2 = E[|Z|^2] = 2K\sigma_0^2$, where $E[\cdot]$ is the
expected value. For ideal distributed amplification, we consider
the case $K \to \infty$. The total noise variance can be calculated
as $\sigma^2 = 2 n_{\mathrm{sp}} h \nu \alpha \Delta\nu L$ [10, Sec. IX-B], where all parameters
are taken from [15] and summarized in Table I. The additive
noise power spectral density as defined in [10] is then given by
$N = n_{\mathrm{sp}} h \nu \alpha = 1.04 \cdot 10^{-20}$ W/km/Hz. Note that the total
additive noise variance scales linearly with the fiber length.
additive noise variance scales linearly with the fiber length. 

An important aspect of this channel model is the fact that
the variance of the phase noise is dependent on the channel
input (cf. (2)), or equivalently on the average transmit power
$P$, defined as $P = E[|X|^2]$. In order to gain some insight
into the qualitative behavior of the channel, in Fig. 1 we
show received scatter plots assuming $X \in \mathcal{X}_{16\text{-QAM}}$, where
$\mathcal{X}_{16\text{-QAM}} = \{\sqrt{P/10}\,(a + \jmath b) : a, b \in \{\pm 1, \pm 3\}\}$ is the
16-QAM constellation, and $K = 100$ for several combinations
of $P$ and $L$. It can be observed that for very low input
power and fiber length, nonlinearities are negligible and the
channel behaves as a standard additive white Gaussian noise
(AWGN) channel. The scatter plots along a diagonal in Fig. 1
correspond to a constant signal-to-(additive-)noise ratio (SNR),
defined as $\mathrm{SNR} = P/\sigma^2$. In contrast to an AWGN channel,
for which the scatter plots along any diagonal would look
similar, the received constellation points in Fig. 1 start to
rotate in a deterministic fashion and the effect of the NLPN
becomes pronounced for large $L$ and $P$. Therefore, in order
to specify the operating point of the channel, the SNR alone
is not sufficient, because the parameter space of the channel is
two-dimensional, cf. [12, Sec. VII]. In this paper, we present
performance results assuming a fixed fiber length and variable
transmit power.
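As an illustration of the channel in (1) and (2), the following sketch draws received samples of the kind shown in the scatter plots. It assumes that the nonlinear phase is accumulated as $(\gamma L/K)$ times the sum of $|X + Z_i|^2$ over the $K$ segments. The function name `nlpn_channel`, the Kerr parameter `gamma = 1.2` (in 1/W/km), and the bandwidth `bandwidth_hz = 42.7e9` are placeholders chosen for illustration, not values confirmed by Table I; only the noise spectral density $1.04 \cdot 10^{-20}$ W/km/Hz is taken from the text.

```python
import numpy as np


def nlpn_channel(x, length_km, num_segments=100, gamma=1.2,
                 noise_psd=1.04e-20, bandwidth_hz=42.7e9, rng=None):
    """Pass symbols x (units of sqrt(W)) through a dispersion-less NLPN channel.

    gamma and bandwidth_hz are assumed placeholder values.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=complex)
    # Total noise variance sigma^2 = 2 * N * bandwidth * L, split over K segments.
    sigma2 = 2.0 * noise_psd * bandwidth_hz * length_km
    sigma0 = np.sqrt(sigma2 / (2.0 * num_segments))  # std per real dimension
    z = np.zeros(x.shape, dtype=complex)   # accumulated noise Z_i
    power_sum = np.zeros(x.shape)          # running sum of |X + Z_i|^2
    for _ in range(num_segments):
        z = z + sigma0 * (rng.standard_normal(x.shape)
                          + 1j * rng.standard_normal(x.shape))
        power_sum += np.abs(x + z) ** 2
    phi_nl = gamma * length_km / num_segments * power_sum
    return (x + z) * np.exp(-1j * phi_nl)


def qam16(power_w):
    """16-QAM points sqrt(P/10) * (a + jb) with a, b in {-3, -1, 1, 3}."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    a, b = np.meshgrid(levels, levels)
    return np.sqrt(power_w / 10.0) * (a + 1j * b).ravel()
```

For example, `nlpn_channel(np.random.default_rng(0).choice(qam16(4e-4), 2000), 5500.0)` returns 2000 received samples for roughly -4 dBm over 5500 km under the assumed parameters. With the noise switched off, the output reduces to a deterministic rotation by $\gamma L |x|^2$.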

B. Amplitude-Phase Shift Keying 

We use the term APSK for the discrete-input constellations
considered in this paper and focus on constellations with $M =
4$, 8, and 16 points. The APSK signal set is defined as [5]

$$\mathcal{X} = \left\{ r_k e^{\jmath (2\pi j / l_k + \varphi_k)} : 1 \le k \le N,\ 0 \le j \le l_k - 1 \right\}, \tag{3}$$

where $N$ denotes the number of amplitude levels or rings, $r_k$
the radius of the $k$th ring, $l_k \ge 1$ the number of points in
the $k$th ring, where $\sum_{k=1}^{N} l_k = M$, and $\varphi_k$ the phase offset
in the $k$th ring. Throughout the paper, we assume a uniform
distribution on the channel input $X$ over all symbols, and thus
$P = (1/M) \sum_{k=1}^{N} l_k r_k^2$. The radii are assumed to be ordered
such that $r_1 < \ldots < r_N$ and we use $\boldsymbol{r} = (r_1, \ldots, r_N)$ to
denote the radius vector. In this paper, for $l_1 = 1$, the point in
the first ring is always placed at the origin, implying $r_1 = 0$.
The radius vector is said to be uniform if $r_{k+1} - r_k = \Delta$ for
$1 \le k \le N - 1$, where $\Delta = r_2$ if $l_1 = 1$ and $\Delta = r_1$ if
$l_1 \ge 2$. The symbols are assumed to be indexed, i.e., $x_i \in \mathcal{X}$,
$i = 1, \ldots, M$. The indexing is done starting in the innermost
ring ($k = 1$) by increasing $j$ from 0 to $l_1 - 1$ and then moving
to the next ring, increasing $j$ from 0 to $l_2 - 1$, and so on. Thus,
finally we have $x_1 = r_1 e^{\jmath \varphi_1}, \ldots, x_M = r_N e^{\jmath (2\pi (l_N - 1)/l_N + \varphi_N)}$.

We also define the vectors $\boldsymbol{l} = (l_1, \ldots, l_N)$ and $\boldsymbol{\varphi} =
(\varphi_1, \ldots, \varphi_N)$, and use the notation $\boldsymbol{l}$-APSK for an APSK
constellation with $N$ rings and $l_k$ points in the $k$th ring, e.g.,
(4,4,4,4)-APSK. Note that this notation does not specify a
particular constellation without ambiguity, due to the missing
information about the radii and phase offsets.
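The definition above can be sketched in a few lines. The helper names `uniform_radii` and `apsk` are hypothetical; the code only assumes the conventions stated in this subsection (ring $k$ carries $l_k$ equally spaced points, a single-point first ring sits at the origin, and the average power is $(1/M)\sum_k l_k r_k^2$).

```python
import numpy as np


def uniform_radii(points_per_ring, power):
    """Uniform radius vector meeting the average power constraint.

    Ring k has radius k * delta, or (k - 1) * delta when l_1 = 1 so that the
    single inner point lies at the origin. Needs at least one ring off the origin.
    """
    l = np.asarray(points_per_ring, dtype=float)
    k = np.arange(1, len(l) + 1, dtype=float)
    steps = k - 1.0 if l[0] == 1 else k
    delta = np.sqrt(power * l.sum() / np.sum(l * steps ** 2))
    return delta * steps


def apsk(points_per_ring, radii, phase_offsets=None):
    """Return the M symbols, indexed ring by ring from the innermost ring."""
    if phase_offsets is None:
        phase_offsets = np.zeros(len(points_per_ring))
    symbols = []
    for l_k, r_k, phi_k in zip(points_per_ring, radii, phase_offsets):
        j = np.arange(l_k)
        symbols.append(r_k * np.exp(1j * (2.0 * np.pi * j / l_k + phi_k)))
    return np.concatenate(symbols)
```

For instance, `apsk((4, 4, 4, 4), uniform_radii((4, 4, 4, 4), 1.0))` gives the uniform (4,4,4,4)-APSK constellation normalized to unit average power.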

III. Detection Methods 

A. Symbol Error Probability 

Let $\mathcal{R}_i$, $1 \le i \le M$, be the decision region implemented by
a detector for the symbol $x_i$, i.e., $\hat{X} = x_i$ if $Y \in \mathcal{R}_i$, where
$\hat{X}$ denotes the detected symbol. The average SEP is then

$$\mathrm{SEP} = 1 - \frac{1}{M} \sum_{i=1}^{M} P_{i \to i}, \tag{4}$$

where $P_{i \to j} = \Pr[\hat{X} = x_j | X = x_i]$, $1 \le i, j \le M$, are the
symbol transition probabilities,¹ calculated as

$$P_{i \to j} = \int_{\mathcal{R}_j} f_{Y|X=x_i}(y) \, \mathrm{d}y. \tag{5}$$

¹ For the SEP, only the cases $j = i$, $1 \le i \le M$, need to be considered.
In Sec. V, all transition probabilities are used.

That is, $P_{i \to j}$ is obtained through integration of the conditional
PDF of the observation given the channel input $X = x_i$ over
the detector region for $x_j$.
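When the integrals in (5) are not evaluated analytically, the transition probabilities can be estimated by simulation and then inserted into (4). The sketch below assumes 0-based symbol indices and that every symbol is transmitted at least once; the function names are hypothetical.

```python
import numpy as np


def transition_matrix(sent_idx, detected_idx, num_symbols):
    """Empirical P[i, j] = Pr[detected x_j | sent x_i] from index arrays."""
    counts = np.zeros((num_symbols, num_symbols))
    np.add.at(counts, (sent_idx, detected_idx), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1.0)  # avoid 0/0 for unused symbols


def average_sep(p_trans):
    """Average SEP for equiprobable symbols: 1 - (1/M) * sum_i P[i, i]."""
    p_trans = np.asarray(p_trans, dtype=float)
    return 1.0 - np.trace(p_trans) / p_trans.shape[0]
```

Only the diagonal of the matrix enters the SEP; the off-diagonal entries become relevant once bit errors under a given labeling are counted.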



B. Maximum Likelihood Detection 

Let the polar representation of the channel input and the
observation be given by $X = R_0 e^{\jmath \Theta_0}$ and $Y = R e^{\jmath \Theta}$,
respectively. The PDF of $Y$ can be written in the form [12,
Sec. III], [3, p. 225], [15]

$$f_{Y|X=x}(y) = \ldots \tag{6}$$

[The right-hand side of (6) is not recoverable from the source text; it is a
series over $m \ge 1$ involving the coefficients $C_m(r, r_0)$.]

where $x = r_0 e^{\jmath \theta_0}$, $y = r e^{\jmath \theta}$, $\Re\{z\}$ is the real part of $z \in \mathbb{C}$,
and the PDF of the received amplitude $R$ given the transmitted
amplitude $R_0 = r_0$ is

$$f_{R|R_0=r_0}(r) = \frac{2r}{\sigma^2} \exp\left( -\frac{r^2 + r_0^2}{\sigma^2} \right) I_0\left( \frac{2 r r_0}{\sigma^2} \right), \tag{7}$$

where $I_0(\cdot)$ is the modified Bessel function of the first kind.
Analytical expressions for the coefficients $C_m(r, r_0)$ can be
found in [12, Sec. III]. The ML detector can now be described
in the form of decision regions $\mathcal{R}_i^{\mathrm{ML}} \subset \mathbb{C}$ for each symbol
$x_i \in \mathcal{X}$:

$$\mathcal{R}_i^{\mathrm{ML}} = \bigcap_{j=1}^{M} \left\{ y \in \mathbb{C} : f_{Y|X=x_i}(y) \ge f_{Y|X=x_j}(y) \right\}. \tag{8}$$

If we take the ML decision regions defined in (8), then (4)
becomes a lower bound on the achievable SEP with suboptimal
detectors.

Evaluating the SEP by numerically integrating over the
PDF (6) is computationally expensive. Moreover, unlike for
an AWGN channel, where the ML regions simply scale
proportionally to $\sqrt{P}$, the ML decision regions defined in (8)
change their shape based on the transmit power $P$ [15]. This
renders ML detection rather impractical for the purpose of
constellation optimization.

C. Two-Stage Detection 

In this paper, we study a slightly modified version of the
suboptimal TS detector proposed in [15]. This detector is a
practical alternative to the ML detector because it has much
lower complexity. In particular, the TS detector employs
one-dimensional decisions: first in the amplitude direction (first
detection stage), followed by a phase rotation, and then in the
phase direction (second detection stage).

In Fig. 2 we show a block diagram of the TS detector.
We refer to the first detection stage together with the phase
rotation as postcompensation. Based on the absolute value of
the observation $R$, radius detection is performed. In contrast
to [15] and [16], we use maximum a posteriori (MAP) instead
of ML radius detection and make use of the a priori probability
for selecting a certain ring at the transmitter, thereby
achieving a small performance advantage. The radius detector
implements the rule: Choose $\hat{R}_0 = r_k$ when $\mu_{k-1} \le R < \mu_k$,
where $\mu_k$, $0 \le k \le N$, denote the decision radii or thresholds.
Fig. 2. Block diagram of the TS detector. Note that the depicted
postcompensation of $Y$ to $\tilde{Y}$ is reversible.



The MAP decision threshold $\mu_k$, $1 \le k \le N - 1$, between $r_k$
and $r_{k+1}$ is obtained by solving

$$\Pr[R_0 = r_k] \, f_{R|R_0=r_k}(\mu_k) = \Pr[R_0 = r_{k+1}] \, f_{R|R_0=r_{k+1}}(\mu_k), \tag{9}$$

where the a priori probabilities are given by $\Pr[R_0 = r_k] = l_k/M$.
We always define $\mu_0 = 0$ and $\mu_N = \infty$. Based on the
radius $\hat{R}_0$ of the detected ring and the received amplitude $R$,
a correction angle $\theta_c$ is calculated, by which the observation
$Y$ is rotated to obtain the postcompensated observation $\tilde{Y}$ as
shown in Fig. 2. The correction angle is given by

$$\theta_c(R, \hat{R}_0) = -\angle C_1(R, \hat{R}_0), \tag{10}$$

which is approximately a quadratic function in $R$ [15].

The second detection stage is then performed with respect
to $\tilde{Y}$: A phase detector chooses the constellation point with
radius $\hat{R}_0$ that is closest to $\tilde{Y}$. Graphically, the TS detector
employs so-called annular sector regions (or annular wedges)
as decision regions for $\tilde{Y}$.
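The first detection stage can be sketched numerically. The code below assumes that the received amplitude follows the standard Rice density with $E[|Z|^2] = \sigma^2$ and solves the MAP condition (9) by bisection between adjacent radii. It further assumes that the weighted densities cross exactly once in that interval, which is expected at moderate to high SNR but is not guaranteed in general. All function names are hypothetical, and the exponentially scaled Bessel function is computed from its integral representation only to keep the example free of external dependencies beyond NumPy.

```python
import numpy as np


def i0_scaled(x, num_nodes=2001):
    """exp(-x) * I_0(x) for x >= 0, from I_0(x) = (1/pi) int_0^pi exp(x cos t) dt."""
    t = np.linspace(0.0, np.pi, num_nodes)
    vals = np.exp(np.multiply.outer(np.asarray(x, dtype=float), np.cos(t) - 1.0))
    # Trapezoidal rule along the last axis.
    integral = (vals[..., 1:] + vals[..., :-1]).sum(axis=-1) * (t[1] - t[0]) / 2.0
    return integral / np.pi


def rice_pdf(r, r0, sigma2):
    """PDF of |r0 + Z| with E|Z|^2 = sigma2, written in an overflow-safe form."""
    r = np.asarray(r, dtype=float)
    return ((2.0 * r / sigma2) * np.exp(-(r - r0) ** 2 / sigma2)
            * i0_scaled(2.0 * r * r0 / sigma2))


def map_thresholds(radii, points_per_ring, sigma2, iters=60):
    """Thresholds mu_1..mu_{N-1}; the priors l_k / M enter only through l_k."""
    thresholds = []
    for k in range(len(radii) - 1):
        lo, hi = radii[k], radii[k + 1]

        def diff(mu):
            return (points_per_ring[k] * rice_pdf(mu, radii[k], sigma2)
                    - points_per_ring[k + 1] * rice_pdf(mu, radii[k + 1], sigma2))

        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if diff(mid) > 0.0:   # inner ring still more likely: move outward
                lo = mid
            else:
                hi = mid
        thresholds.append(0.5 * (lo + hi))
    return np.array(thresholds)
```

The phase rotation of the second step is not included, since it requires the coefficient $C_1$ from [12, Sec. III], which is not reproduced here.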

D. PDF of the Postcompensated Observation 

It is shown in [15] that for PSK signal sets (which, in this
paper, are denoted by $(M)$-APSK), where $R_0 = \sqrt{P} = \text{const.}$,
a minimum-distance detector for $\tilde{Y}$ is equivalent to ML
detection. In contrast, for constellations with multiple amplitude
levels, the receiver structure in Fig. 2 does not perform ML
detection. However, in principle, optimal detection of $X$ is still
possible based on $\tilde{Y}$ due to the fact that the postcompensation
in Fig. 2 is invertible and every invertible function forms a
sufficient statistic for detecting $X$ based on $Y$ [21, p. 443].
Thus, the performance loss associated with the TS detection
scheme originates solely from suboptimal detection regions,
not from the postcompensation itself, which is a lossless
operation.³

In the following, we show how the PDF of the postcompensated
observation $\tilde{Y}$ can be computed. This PDF can then be
used to find optimal detection regions for $\tilde{Y}$. It is clear from
the block diagram of Fig. 2 that the PDF can be written as

$$f_{\tilde{Y}|X=x}(\tilde{y}) = f_{Y|X=x}\left( \tilde{y} \, e^{-\jmath \theta_c(|\tilde{y}|, \hat{R}_0)} \right). \tag{11}$$



² Solving (9) for $\mu_k$ can be done numerically. For an approximate
analytical solution assuming that $r_k \neq 0$, one may apply the high-SNR
approximation $I_0(x) \approx e^{x}/\sqrt{2\pi x}$.

³ An important question that we do not address is whether the phase rotation
(10) is still the best choice for multilevel constellations, assuming that one is
constrained to straight-line phase decision boundaries for $\tilde{Y}$.

Fig. 3. PDF of $\tilde{Y}$ for $P = -5$ dBm and $L = 5500$ km conditioned on
one particular point in each ring of the uniform (4,4,4,4)-APSK constellation.
Color is helpful.



Most importantly, the correction angle $\theta_c$ is a discontinuous
function of the amplitude $R$ because it depends on the detected
ring $\hat{R}_0$. In general, the correction angle can be written as

$$\theta_c(R, \hat{R}_0) = \begin{cases} \theta_c(R, r_1) & \text{if } \mu_0 \le R < \mu_1 \\ \quad \vdots & \\ \theta_c(R, r_N) & \text{if } \mu_{N-1} \le R < \mu_N \end{cases} \tag{12}$$

For illustration purposes, we plot in Fig. 3 the PDF resulting
from (11) and (12), conditioned on one particular point in
each ring of the uniform (4, 4, 4, 4)-APSK constellation.⁴ If
we consider the PDF conditioned on $X = r_2$, i.e., $R_0 = r_2$
and $\theta_0 = 0$ (shown in red), it can be observed that the
contour lines look as though they have been sliced up along
the decision radii of the radius detector. For $R < \mu_1$, the
correction angle is calculated with respect to $r_1$, and thus, the
phase is undercompensated. On the other hand, for $R > \mu_2$,
the wrongly detected radius results in an overcompensation.

E. Performance Comparison 

For a qualitative performance comparison between the different
detectors, in Fig. 4(a) the PDF in (6) is plotted for the
uniform (4, 4, 4, 4)-APSK constellation together with the ML
decision regions. In Fig. 4(b) the PDF in (11) is used instead.
Finally, Fig. 4(c) shows the same PDF as Fig. 4(b) together
with the suboptimal decision regions implemented by the TS
detector. As an example, the decision regions corresponding
to the point $X = r_2 e^{\jmath \pi/2}$ are shaded in Fig. 4.

² (continued) This was done in [16] for the ML radius detector.

⁴ The PDFs of the points which are not shown look identical to the PDF of
the corresponding point in the same ring up to a phase rotation.

Fig. 4. Decision regions for the uniform (4, 4, 4, 4)-APSK constellation:
(a) ML decision regions w.r.t. $Y$; (b) ML decision regions w.r.t. $\tilde{Y}$;
(c) TS decision regions.

Comparing the optimal decision regions in Fig. 4(b) with the
decision regions in Fig. 4(c), it can be seen that TS detection
is clearly suboptimal for this constellation and input power.
However, one would expect the two small shaded regions in
Fig. 4(b) to become smaller for higher power. Intuitively, this is
explained by the increasing accuracy of the radius detector for
increasing transmit power, due to the Rice distribution (7) of
the amplitude. More precisely, let $P_k^{(e)} = \Pr[\hat{R}_0 \neq R_0 | R_0 = r_k]$
be the probability of an error in the first detection stage,
given that a symbol in the $k$th ring is transmitted. Then

$$P_k^{(e)} = 1 - \left( Q_1(\tilde{r}_k, \tilde{\mu}_{k-1}) - Q_1(\tilde{r}_k, \tilde{\mu}_k) \right), \tag{13}$$

where $Q_1(\cdot, \cdot)$ is the Marcum Q-function, $\tilde{r}_k = \sqrt{2} r_k/\sigma$,
and $\tilde{\mu}_k = \sqrt{2} \mu_k/\sigma$. It follows that the SEP under TS detection
converges to the SEP under ML detection for increasing input
power and any APSK constellation with uniform radius vector,
since then $P_k^{(e)}$, $1 \le k \le N$, tends to zero as $P$ increases.
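The first-stage error probability can also be checked by simulation instead of evaluating the Marcum Q-function. The sketch below estimates the probability that the received amplitude falls outside the interval between the two thresholds of ring $k$. It relies on the fact that the phase rotation in (1) leaves the amplitude $|X + Z|$ unchanged, and on the rotational symmetry of the noise, so the transmitted phase can be set to zero. The function name and sample size are arbitrary choices.

```python
import numpy as np


def ring_error_probability(r_k, mu_low, mu_high, sigma2,
                           num_samples=200_000, rng=None):
    """Monte Carlo estimate of Pr[detected ring != k | ring k sent]."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.sqrt(sigma2 / 2.0)  # noise std per real dimension
    z = s * (rng.standard_normal(num_samples)
             + 1j * rng.standard_normal(num_samples))
    amplitude = np.abs(r_k + z)
    correct = (amplitude >= mu_low) & (amplitude < mu_high)
    return 1.0 - correct.mean()
```

For a point at the origin with thresholds 0 and $\mu$, the amplitude is Rayleigh distributed and the estimate should approach $e^{-\mu^2/\sigma^2}$, which gives a simple sanity check. Use `np.inf` as the upper threshold for the outermost ring.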

IV. Constellation Optimization 

A. Problem Statement 

In this section, we optimize the parameters of APSK
constellations by minimizing the SEP under TS detection.
Formally, the optimization problem can be stated as: Given
$M$, $P$, and $L$,

$$\underset{\boldsymbol{l}, \, \boldsymbol{r}}{\text{minimize}} \quad \text{SEP under TS detection} \tag{14}$$
$$\text{subject to} \quad 1 \le N \le M \tag{15}$$
$$l_1 + \ldots + l_N = M \tag{16}$$
$$l_1 r_1^2 + \ldots + l_N r_N^2 = PM \tag{17}$$
$$l_k \ge 1, \quad \text{for } 1 \le k \le N. \tag{18}$$

The objective function can be computed analytically using (4)
with the PDF (11) integrated over the TS detector regions [18,
Eq. (4)]. Note that the phase offset vector $\boldsymbol{\varphi}$ does not appear in
the minimization problem. This is due to the fact that the SEP
under TS detection does not change when a phase offset is applied
to any of the constellation rings: The PDF of $\tilde{Y}$ is simply
rotated by the phase offset, but so is the detector region itself,
and hence the integrals in (5) are not affected.

Fig. 4 (caption, continued): $P = -4$ dBm. Shaded regions correspond to the
point $X = r_2 e^{\jmath \pi/2}$.

It is instructive to begin by discussing two special cases
of the general optimization problem above. The first case is
obtained when $\boldsymbol{r}$ is assumed to be uniform and an optimization
is performed only over the number of points in each ring $\boldsymbol{l}$, cf.
(14)-(18). The optimization problem then becomes an integer
program which can be solved in an exhaustive fashion for
the constellation sizes considered in this paper. The number
of ways to distribute $M$ constellation points to $i$ rings is
given by $\binom{M-1}{i-1}$. At most, there are $M$ rings, which gives
a total of $\sum_{i=1}^{M} \binom{M-1}{i-1} = 2^{M-1}$ possibilities to choose $\boldsymbol{l}$.
There are 8, 128, and 32768 possibilities for 4, 8, and 16
points, respectively. It is clear that such a brute-force approach
becomes infeasible for larger constellations. However, based
on the obtained results, it might be possible to devise more
sophisticated search methods for $M > 16$, e.g., by neglecting
unrealistic combinations a priori.
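The exhaustive search over the number of points per ring amounts to enumerating all ordered ways of writing $M$ as a sum of positive integers. A minimal sketch (the generator name `ring_allocations` is hypothetical) places ring boundaries at a subset of the $M - 1$ gaps between consecutive points:

```python
from itertools import combinations


def ring_allocations(num_points):
    """Yield every (l_1, ..., l_N) of positive integers summing to num_points."""
    for num_rings in range(1, num_points + 1):
        # Choose num_rings - 1 cut positions among the num_points - 1 gaps.
        for cuts in combinations(range(1, num_points), num_rings - 1):
            bounds = (0,) + cuts + (num_points,)
            yield tuple(b - a for a, b in zip(bounds, bounds[1:]))
```

Counting the generated tuples for 4, 8, and 16 points reproduces the totals quoted above, and each tuple can be passed to an SEP evaluation under a uniform radius vector.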

The second special case is given by the optimization of the
radius vector $\boldsymbol{r}$ for a certain constellation with fixed parameter
$\boldsymbol{l}$. For this case, the optimization problem becomes a nonlinear
program. Due to the power constraint, the dimensionality of
the search space is $N - 1$ if $l_1 > 1$ and $N - 2$ if $l_1 = 1$,
respectively. By inspecting the target function, one can verify
that this problem is nonconvex, i.e., a local optimum does
not necessarily imply a global solution. We tested different
nonlinear solvers and obtained very good solutions with the
Nelder-Mead simplex method [22]. Even though the solution
is not guaranteed to be globally optimal, we verified
the global optimality for several constellations and several
combinations of input power and fiber length with a brute-force
grid search.

By combining the solutions to the two special cases above,
a solution to the general optimization problem can then be
found as follows: For each APSK constellation with a certain
fixed parameter $\boldsymbol{l}$ one determines the optimal radius vector via
the simplex method for a given input power and fiber length,
and then optimizes over all possible $\boldsymbol{l}$. We call this approach
joint optimization.
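One way to exploit the reduced dimensionality of the radius search is to let the solver vary only the free radii and to recover the outermost radius from the power constraint (17). The sketch below shows this parametrization only; the helper name `complete_radius_vector` is hypothetical, and the SEP objective itself is not implemented here. A derivative-free simplex routine, for example `scipy.optimize.minimize` with `method="Nelder-Mead"`, could then be run on an objective that penalizes infeasible inputs.

```python
import numpy as np


def complete_radius_vector(free_radii, points_per_ring, power):
    """Return the full radius vector r, or None if the input is infeasible.

    free_radii holds r_1..r_{N-1} if l_1 > 1, or r_2..r_{N-1} if l_1 = 1
    (then r_1 = 0). The last radius follows from sum_k l_k r_k^2 = P * M.
    """
    l = np.asarray(points_per_ring, dtype=float)
    head = [0.0] if l[0] == 1 else []
    known = np.array(head + list(free_radii), dtype=float)
    if len(known) != len(l) - 1:
        raise ValueError("expected N - 1 free radii (N - 2 if l_1 = 1)")
    if np.any(known < 0.0):
        return None
    residual = power * l.sum() - np.sum(l[:-1] * known ** 2)
    if residual <= 0.0:
        return None  # inner rings already use up the power budget
    r = np.append(known, np.sqrt(residual / l[-1]))
    # Enforce the ordering r_1 < ... < r_N assumed in the paper.
    return r if np.all(np.diff(r) > 0.0) else None
```

Returning `None` for infeasible points keeps the ordering and power constraints out of the solver; the caller can map it to a large objective value.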

B. Results and Discussion 

1) 4 Points: We start by finding optimal APSK constellations
with $M = 4$ points. The fiber length is fixed at $L =
7000$ km and the input power $P$ is varied from -15 dBm to
5 dBm in steps of 0.5 dB. In Fig. 5 we plot the performance
of all eight possible 4-point APSK constellations with optimal
radius vector (dashed lines) and the curves are labeled with the
corresponding $\boldsymbol{l}$. (For (4)-APSK and (1,3)-APSK the radius
vector is fixed, i.e., no optimization is performed.) The results
of the joint optimization are shown with markers. We also
show the SEP of (4)-APSK (or 4-QAM) in an AWGN channel
under ML detection as a well-known reference curve (dash-dotted
line). Note that for each $P$ the SEP is minimized by a
constellation with certain parameters $\boldsymbol{l}$ and $\boldsymbol{r}$. For example, it
can be seen that for an input power range between -15 dBm
and -7.5 dBm, (4)-APSK is optimal.

Fig. 5. Results for the constellation optimization with $M = 4$. The fiber
length is $L = 7000$ km. The dash-dotted line is a reference curve, showing
the SEP of 4-QAM in an AWGN channel under ML detection.

Based on the behavior of the SEP for the individual APSK
constellations with optimized radius vector (dashed lines in
Fig. 5), it is possible to classify the constellations into three
classes. The first class exhibits a well-known performance
minimum, i.e., an optimal operating power. The second class
does not exhibit a minimum, but eventually flattens for very
high input power (see, e.g., the performance of (3, 1)-APSK).
The third class exhibits a performance behavior which is
strictly and steadily decreasing with increasing input power.

The flattening of the SEP for the second class is explained
by the availability of a sacrificial point in the outer constellation
ring. The meaning of the term sacrificial is best explained
with the help of an example. In Fig. 6 we show the results of
the radius optimization for (3, 1)-APSK (top), together with
the optimal values of the parameter $\boldsymbol{r}$ (bottom). It can be
observed that, for $P > -3$ dBm, the optimal ring spacing
shows a very peculiar behavior. For this power regime, $r_1$
appears to be fixed and any increase in average power is
absorbed by putting the outermost point further away from the
inner ring. In some sense, the outer point (experiencing very
high NLPN) is sacrificed with the result of saving the average
SEP of the constellation. Observe that (1, 1, 1, 1)-APSK is the
only APSK constellation with 4 points that belongs to the third
class,⁵ and it was already argued in [15] that this constellation
is optimal for very high input power.⁶

Fig. 6. Performance of the (3, 1)-APSK constellation with a uniform and
optimized radius vector (top) and the corresponding radius vector (bottom),
both plotted versus $P$ [dBm].

The system parameters are chosen in such a way that the
obtained results are directly comparable to the performance
of the optimized constellations presented in [15, Sec. IV]. To
facilitate a comparison, in Fig. 7 we provide a digitized
version of [15, Fig. 15] and plot the outcome of the joint
optimization in the same figure. All four constellations used in
[15] can be seen as APSK constellations and they are depicted
in Fig. 7 for convenience. With the notation introduced in
this paper, the constellations are (4)-APSK, (1,3)-APSK,
(2,2)-APSK with $\boldsymbol{\varphi} = (0, \varphi_2)$, and (1,2,1)-APSK with
$\boldsymbol{\varphi} = (0, 0, \varphi_3)$. The parameters $\varphi_2$ and $\varphi_3$ are determined
by a precompensation technique developed in [15], while the
radius vector of the two latter constellations was optimized.



It is important to point out that the optimization in [15]
was performed with respect to ML detection, while for the
joint optimization in this paper the suboptimal TS detector
is assumed. Notice that, for (4)-APSK, these two detection
schemes are equivalent and hence, the performance results
taken from [15] for (4)-APSK (constellation (a)) in Fig. 7
overlap with the results of the joint optimization for an input
power up to -7.5 dBm. Comparing the results in Fig. 7, it can
be seen that the jointly optimized APSK constellations perform
very close to the optimized constellations in [15]. For certain
input powers, e.g., -7 dBm or -5 dBm, a performance loss is
visible, which is explained by the weaker detection technique.



⁵ The SEP for (2, 1, 1)-APSK in Fig. 5 flattens for very high input power.

⁶ For APSK constellations with only one point in each ring the SEP under
TS detection can be calculated using (13) as $\mathrm{SEP} = \frac{1}{M} \sum_{k=1}^{N} P_k^{(e)}$.

Fig. 7. Comparison between the joint optimization of the 4-point APSK
constellation and the results for optimized constellations (a)-(d) in [15] under
ML detection. The legend shows the constellation names originally used in [15].


On the other hand, for some power regimes, e.g., -3.5 dBm or
-2 dBm, the jointly optimized constellations perform as well
as, or even outperform, the best constellations presented in [15].
We attribute this performance gain to the more systematic
search which is offered by the APSK framework. Also note
that there is no need to find phase precompensation angles
as was done in [15], because those are relevant only for
ML detection, but irrelevant for the performance under TS
detection. This removes one degree of freedom from choosing
a constellation and makes the optimization simpler. As a last
point, it is unclear why constellation (d) in [15] does not
exhibit a flattening SEP for very high input power, even though
the radius vector was optimized and a sacrificial point is
available. We conjecture that the SEP results for constellation
(d) for 3 dBm and 5 dBm are only locally optimal.

Fig. 8. Results for the constellation optimization with $M = 8$. The fiber
length is $L = 7000$ km.

2) 8 Points: For $M = 8$ points, we present optimization
results for the same system parameters as before. The results
for the joint optimization are shown in Fig. 8 with markers.
Since it is not very instructive to show the performance of all
8-point APSK constellations in the same figure, we only plot
the SEP of those constellations that are optimal somewhere in
the considered power range (dashed lines). Thus, the parameter
$\boldsymbol{l}$ is indicated by the labeling of the corresponding dashed line.
To avoid cumbersome notation, we also define $1_N = 1, \ldots, 1$
($N$ times). To get a more intuitive feeling for the optimal
constellations in different power regimes, the inset figures
show the actual constellations that are optimal at -12 dBm,
-7.5 dBm, -4.5 dBm, and -1 dBm. The constellations are
shown with their optimized radius vector for the corresponding
input power.

We also perform an optimization only over $\boldsymbol{l}$ assuming
that the radius vector is uniform, and the results are depicted
in Fig. 9 together with the results of the jointly optimized
constellations. The dashed lines in Fig. 9 show the SEP of the
APSK constellation with a fixed parameter $\boldsymbol{l}$ and assuming a
uniform radius vector. An important observation here is that up
to an input power of -3.5 dBm, there is almost no difference
between the performance of the jointly optimized constellations
and the optimal constellations obtained assuming a
uniform radius vector, suggesting that most of the performance
improvement is due to optimizing the parameter $\boldsymbol{l}$.

Fig. 9. Results for the constellation optimization with $M = 8$ (white circles)
assuming a uniform radius vector for all constellations. The fiber length
is $L = 7000$ km. The results of the joint optimization from Fig. 8 (black
markers) are shown for comparison. The dashed lines correspond to the SEP
of the constellations that are indicated by the labels.

3) 16 Points: Motivated by the results obtained for M = 8, 
for M = 16 points, we limit ourselves to the case where the 
radius vector is assumed to be uniform for all constellations. 
The fiber length is L = 5500 km and the input power P is 
varied from -14 dBm to 10 dBm in steps of 2 dBm. The 
results are shown in Fig. 10. 

Fig. 10. Results for the constellation optimization with M = 16. The 
SEP under TS detection for 16-QAM, uniform (4, 4, 4, 4)-APSK, and $(1_{16})$- 
APSK are shown for comparison. The SEP under ML detection for 16-QAM 
in an AWGN channel (dash-dotted line) is shown as a reference. 



The individual SEP under TS detection for 16-QAM, uniform 
(4, 4, 4, 4)-APSK, and $(1_{16})$-APSK are shown for comparison, 
while the SEP under ML detection for 16-QAM in an AWGN 



channel is shown as a reference. The results in Fig. 10 show 
that up to an input power of -8 dBm, the performance of the 
optimized constellations closely follows the performance of 
16-QAM in AWGN. In other words, for this power regime it 
is possible to find APSK constellations with TS detection that 
perform as well as 16-QAM for a channel without nonlinear 
impairments under ML detection. For higher input power, 
the optimized constellations gradually utilize more amplitude 
levels, due to the increase in NLPN. If we take as a baseline 
the minimum SEP achieved by the 16-QAM constellation 
($P \approx -2.8$ dBm and SEP $\approx 10^{-2}$), and interpolate the 
optimal APSK performance for the same SEP, we observe 
that a performance gain of 3.2 dB is achieved. 

In order to verify the assumption that a joint optimization 
approach does not lead to significant performance gains, we 
also perform a reduced complexity approach to the joint 
optimization, where the optimization is restricted to con- 
stellations with at most N = 6 rings. This makes the 
results meaningful only for low and moderate input power 
(P < dBm), because, as we have seen previously, for higher 
input power, the dominance of the NLPN dictates the use 
of more amplitude levels to achieve good performance. We 
found that for P < dBm, the jointly optimized constellations 
with the additional constraint $N \le 6$ coincide with the 
constellations obtained for a uniform radius vector, confirming 
that the joint optimization only yields negligible performance 
improvements for this power regime. For higher input power, 
the obtained jointly optimized constellations perform worse 
than the optimal constellations obtained with a uniform radius 
vector, which is simply due to the restriction to six rings. 

Finally, the phenomenon of sacrificial points discussed pre- 
viously also generalizes to entire rings, i.e., when optimizing 




Fig. 11. Illustration of a sacrificial ring which occurs for the radius 
optimization of (4, 4, 4, 4)-APSK for P > 4 dBm. The system length is 
L = 7000 km. 



the radius vector of constellations with more than one point in 
the outer ring. In this case, however, the SEP still exhibits a 
minimum. As an example, in Fig. 11 we show the result of the 
radius optimization for (4, 4, 4, 4)-APSK (top) together with 
the optimal radius vector (bottom). The radius vector $r$ is plotted 
in a normalized fashion, $\tilde{r} = r/\sqrt{P}$ (i.e., scaled to unit 
average power), so that the effect is clearer. It can be observed that up 
to an input power of 4 dBm the distance between any two 
adjacent rings for the optimal radius vector is approximately 
the same. Moreover, the distance decreases with higher input 
power (like "squeezing accordion pleats" [23]). However, for 
P > 4 dBm the optimal radius vector changes significantly. 
Somewhat counterintuitively, in this power regime, it is better 
to place the four points in the outer constellation ring far away 
from the other rings. 
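
The normalization used for plotting the radius vector can be illustrated with a short Python sketch. It assumes equiprobable symbols, so that the average power of an APSK constellation with $l_k$ points on the ring of radius $r_k$ is $\sum_k l_k r_k^2 / M$; the function names and the example radii are hypothetical and are not taken from the paper.

```python
import math


def average_power(l, r):
    """Average symbol power, assuming all M = sum(l) points are equiprobable."""
    M = sum(l)
    return sum(l_k * r_k ** 2 for l_k, r_k in zip(l, r)) / M


def normalize_radii(l, r):
    """Scale the radius vector r so that the constellation has unit average power."""
    scale = math.sqrt(average_power(l, r))
    return [r_k / scale for r_k in r]


if __name__ == "__main__":
    l = (4, 4, 4, 4)          # (4, 4, 4, 4)-APSK
    r = (1.0, 2.0, 3.0, 4.0)  # illustrative, equally spaced radii
    r_tilde = normalize_radii(l, r)
    print([round(x, 3) for x in r_tilde])
    print(round(average_power(l, r_tilde), 6))
```

Because every radius is divided by the same factor, the ratios between rings are preserved, which is why changes in ring spacing remain visible across input powers in a normalized plot.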

V. Binary Labelings 

In order to allow for the transmission of binary data, we 
assume that each symbol $x_i \in \mathcal{X}$ is labeled with a binary vector 
$c_i = (c_{i,1}, \ldots, c_{i,m}) \in \{0, 1\}^m$, where $m = \log_2 M$. The 
different binary vectors are the binary representations of the 
integers $\{0, 1, \ldots, M-1\}$.^7 A specific mapping between vectors 
and constellation symbols is called a binary labeling, which 
will be denoted by an $M \times m$ matrix $L_m = (c_1^T, \ldots, c_M^T)^T$. 

A Gray labeling is obtained if the binary vectors of neighboring 
symbols, i.e., symbols that are closest in terms of 
Euclidean distance, differ in only one bit position. As an 
example, the Gray labeling $G_m$ for $(M)$-APSK constellations^8 
may be constructed by $m - 1$ recursive reflections of the 
trivial labeling $G_1 = (0, 1)^T$. To obtain $G_{m+1}$ from 
$G_m = (c_1^T, \ldots, c_M^T)^T$ by reflection, generate the matrix 
$(c_1^T, \ldots, c_M^T, c_M^T, \ldots, c_1^T)^T$ 
and add an extra column from the left, consisting of $M$ zeros 
followed by $M$ ones [24]. 

7 One may arbitrarily choose $c_{i,1}$ as the most significant bit. 

8 Different Gray labelings exist for a given constellation and for simplicity 
we restrict ourselves to $G_m$, which is referred to as the binary reflected Gray 
code (BRGC) in the literature: It is the provably optimal Gray labeling for 
PSK constellations in an AWGN channel at high SNR [24]. 
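
The reflection procedure can be sketched in a few lines of Python. This is a minimal illustration under the assumption that a labeling is stored as a list of bit tuples, one per symbol; the function names `brgc` and `hamming` are our own.

```python
def brgc(m):
    """Binary reflected Gray code G_m as a list of 2**m bit tuples (rows)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    labeling = [(0,), (1,)]  # trivial labeling G_1 = (0, 1)^T
    for _ in range(m - 1):
        half = len(labeling)
        # Reflection: the rows in forward order followed by the rows in reverse order.
        reflected = labeling + labeling[::-1]
        # Extra column from the left: zeros for the first half, ones for the second.
        labeling = [((0,) if k < half else (1,)) + row
                    for k, row in enumerate(reflected)]
    return labeling


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for row in brgc(3):
        print("".join(str(bit) for bit in row))
```

Consecutive rows, including the wrap-around from the last row to the first, differ in exactly one bit, which is the property needed when the rows are assigned to adjacent points of a PSK ring.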

A. Bit Error Probability 

The average BEP of the signal constellation is given by 

$$\mathrm{BEP} = \frac{1}{mM} \sum_{i=1}^{M} \sum_{j=1}^{M} d_H(c_i, c_j) P_{i \to j}, \tag{19}$$

where $d_H(\cdot, \cdot)$ denotes the Hamming distance between two 
binary vectors. A useful bound for the BEP directly follows 
from $1 \le d_H(c_i, c_j) \le m$, $i \ne j$, and is given by 

$$\frac{\mathrm{SEP}}{m} \le \mathrm{BEP} \le \mathrm{SEP}. \tag{20}$$



The probabilities $P_{i \to j}$ are fixed for a given constellation 
and P and L (cf. (5)) and thus, (19) only depends on the 
labeling. However, it is important to realize that the phase 
offset vector $\varphi$ has an effect on these probabilities for 
$j \ne i$. Therefore, even though two APSK constellations with the 
same $l$ and $r$ but different $\varphi$ have the same SEP, they may 
have a different BEP. In the following, we show how to exploit 
this new degree of freedom for a class of APSK constellations 
that permit the use of a Gray labeling. 
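
To make the dependence of the BEP on the labeling concrete, the following Python sketch evaluates the average BEP and the SEP for a given matrix of symbol transition probabilities. The 4-point transition matrix is a made-up toy example (not a result from the paper), equiprobable symbols are assumed, and the function names are our own.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def sep(P):
    """Average SEP for equiprobable symbols; P[i][j] is the probability of deciding j given i."""
    M = len(P)
    return sum(P[i][j] for i in range(M) for j in range(M) if i != j) / M


def bep(P, labels):
    """Average BEP: Hamming-weighted sum of transition probabilities over all symbol pairs."""
    M = len(P)
    m = len(labels[0])
    total = sum(hamming(labels[i], labels[j]) * P[i][j]
                for i in range(M) for j in range(M))
    return total / (m * M)


if __name__ == "__main__":
    # Toy ring of 4 symbols: errors to a neighbor are more likely than to the opposite point.
    P = [[0.90, 0.04, 0.02, 0.04],
         [0.04, 0.90, 0.04, 0.02],
         [0.02, 0.04, 0.90, 0.04],
         [0.04, 0.02, 0.04, 0.90]]
    gray = [(0, 0), (0, 1), (1, 1), (1, 0)]
    natural = [(0, 0), (0, 1), (1, 0), (1, 1)]
    m = 2
    print("SEP:", sep(P), "lower bound:", sep(P) / m)
    print("BEP (Gray):", bep(P, gray))
    print("BEP (natural):", bep(P, natural))
```

Both labelings share the same SEP, since the transition probabilities are unchanged, but the Gray labeling yields the smaller BEP because the most likely error events cost only one bit.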

B. Rectangular APSK 

APSK constellations with $1 < N < M$, $\varphi = 0$, and $l = 
l_k = M/N$, $1 \le k \le N$, have a "rectangular" structure when 
plotted in polar coordinates. For these constellations, a Gray 
labeling is given by $L_m = G_{\log_2 N} \otimes G_{\log_2 l}$, where $\otimes$ is the 
ordered direct product, defined as 

$$(a_1^T, \ldots, a_p^T)^T \otimes (b_1^T, \ldots, b_q^T)^T = (c_1^T, \ldots, c_{pq}^T)^T, \tag{21}$$

where $c_{q(i-1)+j} = (a_i, b_j)$, for $1 \le i \le p$ and $1 \le j \le q$. 
This amounts to independently choosing a Gray labeling for 
the radius and phase coordinates of the constellation and then 
concatenating the binary vectors. In Fig. 12, an example for 
such a construction is shown. 
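
The construction can be sketched in Python as follows. The sketch assumes that symbols are indexed ring by ring (all points of the innermost ring first), so that row $q(i-1)+j$ of the product belongs to the $j$-th point of the $i$-th ring; the helper names are hypothetical.

```python
def brgc(m):
    """Binary reflected Gray code G_m as a list of 2**m bit tuples."""
    labeling = [(0,), (1,)]
    for _ in range(m - 1):
        half = len(labeling)
        reflected = labeling + labeling[::-1]
        labeling = [((0,) if k < half else (1,)) + row
                    for k, row in enumerate(reflected)]
    return labeling


def ordered_direct_product(A, B):
    """Rows (a_i, b_j), ordered with i as the slow index and j as the fast index."""
    return [a + b for a in A for b in B]


def log2_exact(n):
    """Return log2(n) for n = 2, 4, 8, ...; reject anything else."""
    if n < 2 or n & (n - 1):
        raise ValueError("expected a power of two greater than 1")
    return n.bit_length() - 1


def rectangular_gray_labeling(N, l):
    """Gray labeling for rectangular APSK with N rings and l points per ring."""
    return ordered_direct_product(brgc(log2_exact(N)), brgc(log2_exact(l)))


if __name__ == "__main__":
    # (4, 4)-APSK: N = 2 rings, l = 4 points per ring, 3 bits per symbol.
    for index, bits in enumerate(rectangular_gray_labeling(2, 4), start=1):
        print(index, "".join(str(b) for b in bits))
```

In the resulting labeling, the leading bits identify the ring and the trailing bits identify the phase position, so that moving to an adjacent ring at the same phase position, or to an adjacent phase position on the same ring, changes exactly one bit.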

Gray labelings ensure that for a standard AWGN channel 
the BEP closely approaches the lower bound in (20) for high 
SNR. Using the same labeling in a nonlinear channel, however, 
does not necessarily ensure good performance: The problem is 
that the binary labeling is constructed without considering the 
underlying PDF of the observation. From Fig. 3 it is evident 
that this PDF has a rather unusual shape, due to the slicing 
effect caused by the postcompensation. To further illustrate 
this point, consider (4, 4)-APSK with $r_1/r_2 = 0.424$ and 
$\varphi = (0, 0)$, operating at L = 7000 km and P = -5 dBm, 
which is the optimal APSK constellation for these parameters 
(cf. Fig. 8). The PDF of Y conditioned on one point in each 
ring^9 is evaluated and plotted in polar coordinates, as shown 
in Fig. 13(a) for $x_3$ and $x_7$. The solid lines correspond to 



9 The PDFs conditioned on all other points can be obtained by a phase 
translation. Note that the PDF is periodic in phase. 
Fig. 12. Illustration of the labeling $G_2 \otimes G_2$ for (4, 4, 4, 4)-APSK. In the 
blueprint (a), the polar coordinates of the points are labeled independently. 
The final labeling in (b) is achieved by concatenating the binary vectors in 
the blueprint. 

Fig. 13. In (a), the PDF of Y is shown for (4, 4)-APSK conditioned on 
$X = x_3$ (blue) and $X = x_7$ (red) for L = 7000 km and P = -5 dBm. 
Solid lines correspond to the decision boundaries of the TS detector and dotted 
lines show connections between neighboring symbols according to the PDF. 



the decision boundaries of the TS detector. Recalling that the 
symbol transition probabilities are obtained through integration 
of the PDF over the detector regions (cf. (5)), Fig. 13(a) 
can then be used to identify "neighboring symbols" of $x_3$ 
and $x_7$ (and consequently of all points) in the sense that the 
corresponding transition probabilities will dominate in (19). 
The main observation here is that, even though $x_3$ and $x_7$ 
are adjacent symbols in radial direction, the corresponding 
transition probabilities, i.e., $P_{3 \to 7}$ and $P_{7 \to 3}$, are negligible 
compared to $P_{3 \to 8}$ and $P_{7 \to 2}$, respectively. The dotted lines in 
Fig. 13(a) show appropriate connections between neighboring 
symbols taking into account the nonlinear PDF, similar to 
the dotted lines in Fig. 12(a), which are appropriate for a 
Gaussian PDF. In Fig. 13(b) the (4, 4)-constellation is shown 
with the Gray labeling $G_1 \otimes G_2$ (top) and the modified labeling 
(bottom) that results from "following" the dotted lines in 
Fig. 13(a) and concatenating the binary vectors. Observe that 
the labeled constellation in the bottom of Fig. 13(b) may be 
obtained from the Gray labeled constellation by using a phase 
offset vector of $\varphi = (0, \pi/2)$: In this case, the non-zero phase 
offset in the second ring does not change the constellation (i.e., 
the set of symbols), but rather leads to a different indexing 
of symbols (cf. Sec. II-B), and consequently to a different 
mapping between symbols and binary vectors. 
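
The re-indexing effect of the phase offset can be checked numerically. The sketch below assumes that the $i$-th point of ring $k$ is $r_k \exp(j(2\pi i / l_k + \varphi_k))$ and that symbols are indexed ring by ring; this indexing convention and the function names are assumptions made for illustration, since the definition in Sec. II-B is not reproduced here.

```python
import cmath
import math


def apsk(l, r, phi):
    """APSK points, ring by ring; ring k has l[k] points, radius r[k], phase offset phi[k]."""
    points = []
    for l_k, r_k, phi_k in zip(l, r, phi):
        for i in range(l_k):
            points.append(r_k * cmath.exp(1j * (2 * math.pi * i / l_k + phi_k)))
    return points


def same_set(a, b, tol=1e-9):
    """True if both lists contain the same points, irrespective of ordering."""
    return len(a) == len(b) and all(any(abs(x - y) < tol for y in b) for x in a)


if __name__ == "__main__":
    l, r = (4, 4), (0.424, 1.0)  # (4, 4)-APSK with r_1 / r_2 = 0.424
    base = apsk(l, r, (0.0, 0.0))
    shifted = apsk(l, r, (0.0, math.pi / 2))
    print("same set of symbols:", same_set(base, shifted))
    for i in range(4):
        # With the offset, outer-ring point i coincides with the former point (i + 1) mod 4.
        match = abs(shifted[4 + i] - base[4 + (i + 1) % 4]) < 1e-9
        print("outer-ring index", i, "matches former index", (i + 1) % 4, ":", match)
```

Since a binary labeling assigns vectors by symbol index, this cyclic shift of the outer-ring indices is what changes the mapping between symbols and binary vectors while the transmitted signal set stays the same.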

Going one step further, we now allow for arbitrary phase 
offsets in all constellation rings^10 with the intention to "steer" 
the phase decision boundaries such that they are roughly 
symmetric around the PDF. Starting from a Gray labeled 
rectangular APSK constellation, a simple way to achieve this 
is by initializing $\varphi_1 = 0$ and then calculating 



$$\varphi_i = \theta_c(\mu_{i-1}, r_{i-1}) - \theta_c(\mu_{i-1}, r_i) + \varphi_{i-1} \tag{22}$$



for $i = 2, \ldots, N$. For the previous example, evaluating (22) 
for $i = 2$ results in $\varphi_2 \approx 1.878$, corresponding to the length 
of the dashed, gray arrow in Fig. 13(a). The two dashed, gray 
lines are the new phase decision boundaries for $x_7$ and it can 
be seen that they appear roughly symmetric around the blue 
PDF. We highlight that this proposed method to determine the 
phase offset vector may be applied to any rectangular APSK 
constellation of arbitrary constellation size, provided that M 
is a power of 2. 
Fig. 14. Average BEP and lower bounds (LB) for M = 8 and L = 7000 km 
for different APSK constellations. The subfigures show the optimized APSK 
constellations for the corresponding input power. 



C. General Case 

For APSK constellations with arbitrary $l$, there seems to be 
little exploitable structure in order to constructively "design" 
good binary labelings. It was pointed out in [25] that the labeling 
problem falls under the category of quadratic assignment 
problems, and as such, is NP-hard. For M = 4 and M = 8, it 
is, however, easily possible to exhaustively search for optimal 
labelings. Note that for a constellation with M points, M! 
different binary labelings exist, although not all labelings lead 
to a different BEP (for example, swapping bit positions does not 
affect the BEP). 
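
An exhaustive search of this kind can be sketched in Python as a loop over all $M!$ assignments of binary vectors to symbols. The transition matrix below is a made-up 4-point toy example with equiprobable symbols, not data from the paper, and the function names are our own.

```python
from itertools import permutations, product


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def bep(P, labels):
    """Average BEP for transition probabilities P[i][j] and one bit tuple per symbol."""
    M, m = len(P), len(labels[0])
    return sum(hamming(labels[i], labels[j]) * P[i][j]
               for i in range(M) for j in range(M)) / (m * M)


def exhaustive_labeling_search(P):
    """Try all M! labelings and return (best BEP, best labeling)."""
    M = len(P)
    m = M.bit_length() - 1  # assumes M is a power of two
    vectors = list(product((0, 1), repeat=m))
    best_value, best_labels = None, None
    for labels in permutations(vectors):
        value = bep(P, labels)
        if best_value is None or value < best_value:
            best_value, best_labels = value, labels
    return best_value, best_labels


if __name__ == "__main__":
    P = [[0.90, 0.04, 0.02, 0.04],
         [0.04, 0.90, 0.04, 0.02],
         [0.02, 0.04, 0.90, 0.04],
         [0.04, 0.02, 0.04, 0.90]]
    value, labels = exhaustive_labeling_search(P)
    print(value, labels)
```

For M = 8 the same loop visits 8! = 40320 labelings, which is still cheap, whereas for M = 16 the count 16! exceeds $2 \times 10^{13}$, so a plain enumeration like this one is no longer practical.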

D. Results and Discussion 

First, we demonstrate the performance improvement that 
can be achieved with the proposed method to choose the 
phase offset vector in (22). In Fig. 14, the lower bound 
(LB) in (20) is plotted for (4,4)-APSK with optimized $r$, 
L = 7000 km, and P varying from -10 dBm to -4 dBm 
in steps of 0.5 dBm (black, dashed line). The average BEP of 
the labeled constellation is shown with the proposed phase 
offset vector (red markers) and with $\varphi = 0$, which results in 
the conventional Gray labeling (blue, dashed line). It can be 
observed that the performance with the proposed method is 
almost indistinguishable from the lower bound (20) and a gain 
of approximately 0.4 dB is visible at SEP = $10^{-3}$ over the 
Gray labeling approach. 



Moreover, in Fig. 14 we plot the LB based on the jointly 
optimized APSK constellations with M = 8 (cf. Fig. 8), where 
the subfigures are provided to show the optimal parameter $l$ for 
the different input powers. For each input power, the optimal 
labeling is determined exhaustively and the corresponding 
BEP is shown by the green line. Note that the LB is tight 
only for the rectangular (4, 4)-APSK, while for the other 

10 Note that an APSK constellation with $\varphi \ne 0$ is not necessarily 
rectangular anymore. 



constellations a slight performance loss is visible. These results 
demonstrate that first optimizing the constellation with respect 
to SEP and then choosing an optimal labeling is not 
guaranteed to give the best BEP. In particular, for -7 dBm and 
-6.5 dBm, (4,4)-APSK with the proposed phase offset vector 
achieves a lower BEP than the jointly optimized constellations 
(with respect to SEP) with an optimal labeling. 

As a last point, one might argue that the class of rectangular 
APSK constellations is not particularly interesting, because 
they appear rarely as optimal APSK constellations with respect 
to SEP (e.g., for M = 8 they only appear in a small power 
range and for M = 16 they do not appear at all). However, 
the previous results show that if we take the BEP as the main 
figure of merit, rectangular APSK constellations may be ad- 
vantageous in certain power regimes, because they can closely 
approach the lower BEP bound. Moreover, as we described 
earlier, the proposed labeling method is easily applicable to 
any constellation size. It would therefore be relatively simple 
to find optimal rectangular APSK constellations for M > 16 
because in this case not many choices exist. Obviously, these 
constellations might then be far away from the performance 
of the true optimal constellation, but they might still offer a 
significant performance gain over rectangular QAM constel- 
lations in the presence of NLPN, with the advantage that a 
constructive labeling method is readily available. 

VI. Conclusion 

In this paper, we optimized APSK constellations for a fiber- 
optical channel model without dispersion. It was shown how to 
derive the PDF of the postcompensated observation assuming a 
TS detection scheme. The PDF was used to gain insight into 
the performance behavior with respect to optimal detection 
and to calculate the BEP. Optimal APSK constellations under 
TS detection have been presented. For M = 16 constellation 
points, significant performance improvements in terms of SEP 
can be achieved by choosing an optimized APSK constellation 
compared to a baseline 16-QAM constellation. For very high 
input power, sacrificing points or constellation rings may 
become beneficial. Finally, the binary labeling problem was 
discussed and a constructive labeling method was presented, 
which is applicable to rectangular APSK constellations. An 
important topic for future work would be the investigation of 
the influence of fiber chromatic dispersion and nonlinearities 
on the optimal signal set. 